recursive training
A theoretical basis for model collapse in recursive training
Our analysis will draw heavily upon the three topics in probability theory mentioned above. We briefly summarize the relevant results here; they can be found in [3] (see also [12] for a more extensive treatment), [11], and [2] (see also [9] for a more extensive treatment), respectively. A. Convergence of probability measures: Let S be a Polish space, i.e., a separable topological space whose topology is compatible with a complete metric. Let B denote its Borel σ-field, i.e., the smallest σ-field containing its open sets.
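For reference, weak convergence of probability measures on such a space admits the standard portmanteau characterization below (our restatement of a textbook fact, not a quotation from [3]):

```latex
% Weak convergence of probability measures \mu_n, \mu on a Polish space S
% with Borel sigma-field B: convergence of expectations of bounded
% continuous test functions (standard statement, not quoted from [3]).
\[
  \mu_n \Rightarrow \mu
  \quad \Longleftrightarrow \quad
  \int_S f \,\mathrm{d}\mu_n \;\longrightarrow\; \int_S f \,\mathrm{d}\mu
  \quad \text{for every bounded continuous } f : S \to \mathbb{R}.
\]
```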
Knowledge Collapse in LLMs: When Fluency Survives but Facts Fail under Recursive Synthetic Training
Keisha, Figarri, Wu, Zekun, Wang, Ze, Koshiyama, Adriano, Treleaven, Philip
Large language models increasingly rely on synthetic data due to the scarcity of human-written content, yet recursive training on model-generated outputs leads to model collapse, a degenerative process that threatens factual reliability. We define knowledge collapse as a distinct three-stage phenomenon where factual accuracy deteriorates while surface fluency persists, creating "confidently wrong" outputs that pose critical risks in accuracy-dependent domains. Through controlled experiments with recursive synthetic training, we demonstrate that collapse trajectory and timing depend critically on instruction format, distinguishing instruction-following collapse from traditional model collapse through its conditional, prompt-dependent nature. We propose domain-specific synthetic training as a targeted mitigation strategy that achieves substantial improvements in collapse resistance while maintaining computational efficiency. Our evaluation framework combines model-centric indicators with task-centric metrics to detect distinct degradation phases, enabling reproducible assessment of epistemic deterioration across different language models. These findings provide both theoretical insights into collapse dynamics and practical guidance for sustainable AI training in knowledge-intensive applications where accuracy is paramount.
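The "confidently wrong" regime is, in effect, a divergence between a fluency signal and a factual-accuracy signal tracked across training generations. The sketch below is our illustration only; the metric series, thresholds, and stage labels are invented placeholders, not the paper's evaluation framework:

```python
# Hypothetical sketch of generation-wise monitoring: track a fluency proxy
# and a factual-accuracy metric per generation, and flag the regime where
# fluency holds but facts fail. All numbers and thresholds are invented.

def classify_stage(fluency, accuracy, flu_floor=0.8, acc_floor=0.8):
    if fluency >= flu_floor and accuracy >= acc_floor:
        return "healthy"
    if fluency >= flu_floor and accuracy < acc_floor:
        return "confidently wrong"   # fluent surface form, failing facts
    return "full collapse"           # surface fluency has degraded too

# Toy per-generation scores (purely illustrative):
fluency_by_gen  = [0.95, 0.94, 0.93, 0.92, 0.90, 0.70]
accuracy_by_gen = [0.92, 0.88, 0.75, 0.55, 0.40, 0.30]

for gen, (f, a) in enumerate(zip(fluency_by_gen, accuracy_by_gen)):
    print(f"gen {gen}: fluency={f:.2f} accuracy={a:.2f} -> {classify_stage(f, a)}")
```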
Machine-generated text detection prevents language model collapse
Drayson, George, Lampos, Vasileios
As Large Language Models (LLMs) become increasingly prevalent, their generated outputs are proliferating across the web, risking a future where machine-generated content dilutes human-authored text. Since online data is the primary resource for LLM pre-training, subsequent models could be trained on an unknown portion of synthetic samples. This will lead to model collapse, a degenerative process whereby LLMs reinforce their own errors and ultimately yield declining performance. In this study, we investigate the impact of decoding strategy on model collapse, analysing the characteristics of text at each model generation, the similarity to human references, and the resulting model performance. Using the decoding strategies that lead to the most significant degradation, we evaluate model collapse in more realistic scenarios where the origin of the data (human or synthetic) is unknown. We train a machine-generated text detector and propose an importance sampling approach to alleviate model collapse. Our method is validated on two LLM variants (GPT-2 and SmolLM2) on the open-ended text generation task. We demonstrate that it can not only prevent model collapse but also improve performance when sufficient human-authored samples are present.
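Our reading of the proposed mitigation, in sketch form: score each candidate training document with a machine-generated-text detector, then resample the corpus with weights tied to the detector's probability that the text is human-authored. The function and variable names below are illustrative, not the authors' code:

```python
# Minimal sketch of detector-guided importance sampling over a training
# corpus. `p_human` stands in for detector outputs; resampling favours
# documents the detector judges to be human-written.
import numpy as np

rng = np.random.default_rng(0)

def resample_corpus(docs, p_human, n_samples):
    """Importance-sample `docs` in proportion to the detector's
    probability that each document is human-authored."""
    weights = np.asarray(p_human, dtype=float)
    weights /= weights.sum()
    idx = rng.choice(len(docs), size=n_samples, replace=True, p=weights)
    return [docs[i] for i in idx]

# Toy usage with made-up detector scores:
docs = ["human-like text A", "synthetic-looking text B", "human-like text C"]
p_human = [0.9, 0.2, 0.8]   # hypothetical detector outputs
print(resample_corpus(docs, p_human, n_samples=5))
```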
How Bad is Training on Synthetic Data? A Statistical Analysis of Language Model Collapse
Seddik, Mohamed El Amine, Chen, Suei-Wen, Hayou, Soufiane, Youssef, Pierre, Debbah, Merouane
The phenomenon of model collapse, introduced in (Shumailov et al., 2023), refers to the deterioration in performance that occurs when new models are trained on synthetic data generated from previously trained models. This recursive training loop makes the tails of the original distribution disappear, thereby making future-generation models forget about the initial (real) distribution. With the aim of rigorously understanding model collapse in language models, we consider in this paper a statistical model that allows us to characterize the impact of various recursive training scenarios. Specifically, we demonstrate that model collapse cannot be avoided when training solely on synthetic data. However, when mixing both real and synthetic data, we provide an estimate of a maximal amount of synthetic data below which model collapse can eventually be avoided. Our theoretical conclusions are further supported by empirical validations.
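As a toy illustration of the mixing result (not the paper's statistical model, which is developed for language models), the simulation below recursively refits a categorical distribution while varying the synthetic fraction of each generation's corpus; the count of surviving categories is a crude proxy for tail loss:

```python
# Recursive re-estimation of a long-tailed categorical distribution.
# With a purely synthetic corpus (alpha=1.0) the support shrinks every
# generation; mixing in real samples keeps re-injecting tail mass.
import numpy as np

rng = np.random.default_rng(0)
K = 1000
true_dist = rng.dirichlet(np.full(K, 0.5))   # long-tailed "real" distribution

def run(alpha, generations=20, n=20_000):
    """Each generation trains on a corpus with fraction `alpha` synthetic data."""
    dist = true_dist
    for _ in range(generations):
        n_syn = int(alpha * n)
        synthetic = rng.choice(K, size=n_syn, p=dist)        # from previous model
        real = rng.choice(K, size=n - n_syn, p=true_dist)    # fresh real samples
        counts = np.bincount(np.concatenate([synthetic, real]), minlength=K)
        dist = counts / counts.sum()                         # refit the "model"
    return (dist > 0).sum()

for alpha in (1.0, 0.9, 0.5):
    print(f"synthetic fraction {alpha}: categories retained = {run(alpha)} / {K}")
```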
Recursive Training of 2D-3D Convolutional Networks for Neuronal Boundary Prediction
Lee, Kisuk, Zlateski, Aleksandar, Vishwanathan, Ashwin, Seung, H. Sebastian
Efforts to automate the reconstruction of neural circuits from 3D electron microscopic (EM) brain images are critical for the field of connectomics. An important computation for reconstruction is the detection of neuronal boundaries. Images acquired by serial section EM, a leading 3D EM technique, are highly anisotropic, with inferior quality along the third dimension. For such images, the 2D max-pooling convolutional network has set the standard for performance at boundary detection. Here we achieve a substantial gain in accuracy through three innovations.
Learning to Self-Train for Semi-Supervised Few-Shot Classification
Sun, Qianru, Li, Xinzhe, Liu, Yaoyao, Zheng, Shibao, Chua, Tat-Seng, Schiele, Bernt
Few-shot classification (FSC) is challenging due to the scarcity of labeled training data (e.g. only one labeled data point per class). Meta-learning has been shown to achieve promising results by learning to initialize a classification model for FSC. In this paper we propose a novel semi-supervised meta-learning method called learning to self-train (LST) that leverages unlabeled data and specifically meta-learns how to cherry-pick and label such unlabeled data to further improve performance. To this end, we train the LST model through a large number of semi-supervised few-shot tasks. On each task, we train a few-shot model to predict pseudo labels for unlabeled data, and then iterate the self-training steps on labeled and pseudo-labeled data, with each step followed by fine-tuning. We additionally learn a soft weighting network (SWN) to optimize the self-training weights of pseudo labels so that better ones can contribute more to gradient descent optimization. We evaluate our LST method on two ImageNet benchmarks for semi-supervised few-shot classification and achieve large improvements over the state of the art.
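A rough sketch of the self-training loop just described, with the learned soft weighting network replaced by a simple confidence-based weight (a stand-in we chose for illustration; LST meta-learns these weights, and the toy data and model here are ours, not the paper's):

```python
# Self-training on a toy binary task: fit on labeled data, pseudo-label the
# unlabeled pool, weight pseudo-labels by confidence, and fine-tune; repeat.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Few labeled points, many unlabeled ones; class 1 is shifted along x.
X_lab = rng.normal(size=(10, 2)) + np.array([[2, 0]]) * (np.arange(10) % 2)[:, None]
y_lab = (np.arange(10) % 2).astype(float)
X_unl = rng.normal(size=(200, 2)) + np.array([[2, 0]]) * rng.integers(0, 2, 200)[:, None]

w = np.zeros(2)                            # logistic-regression weights
for step in range(5):                      # outer self-training iterations
    p_unl = sigmoid(X_unl @ w)
    pseudo = (p_unl > 0.5).astype(float)   # hard pseudo-labels
    conf = np.abs(p_unl - 0.5) * 2         # soft weights in [0, 1] (SWN stand-in)
    X = np.vstack([X_lab, X_unl])
    y = np.concatenate([y_lab, pseudo])
    sw = np.concatenate([np.ones(len(y_lab)), conf])
    for _ in range(100):                   # inner fine-tuning steps
        grad = X.T @ (sw * (sigmoid(X @ w) - y)) / sw.sum()
        w -= 0.5 * grad
    acc = ((sigmoid(X_lab @ w) > 0.5) == y_lab).mean()
    print(f"iter {step}: labeled accuracy = {acc:.2f}")
```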